What is ahocorasick?
The ahocorasick npm package is an implementation of the Aho-Corasick algorithm, which is used for searching multiple keywords in a text simultaneously. It is particularly useful for tasks that involve pattern matching, such as text search, spam filtering, and DNA sequence analysis.
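To make the idea of searching for several keywords in one pass concrete, here is a minimal sketch of the Aho-Corasick technique in plain JavaScript. It is an illustration written for this page, not the package's source code, and the names buildAutomaton and searchAll are hypothetical. It builds a trie of the keywords, adds failure links with a breadth-first pass, and then scans the text once, reporting the index of the last character of each match.

```javascript
// Illustrative sketch only; NOT the ahocorasick package's implementation.
function buildAutomaton(keywords) {
  // Each node holds outgoing edges, a failure link, and the keywords ending here.
  const nodes = [{ next: new Map(), fail: 0, out: [] }];

  // Phase 1: insert every keyword into a trie.
  for (const word of keywords) {
    let state = 0;
    for (let i = 0; i < word.length; i++) {
      const ch = word[i];
      if (!nodes[state].next.has(ch)) {
        nodes.push({ next: new Map(), fail: 0, out: [] });
        nodes[state].next.set(ch, nodes.length - 1);
      }
      state = nodes[state].next.get(ch);
    }
    nodes[state].out.push(word);
  }

  // Phase 2: breadth-first pass that computes failure links.
  // Depth-1 states keep the default failure link to the root (state 0).
  const queue = [...nodes[0].next.values()];
  while (queue.length > 0) {
    const r = queue.shift();
    for (const [ch, s] of nodes[r].next) {
      queue.push(s);
      let f = nodes[r].fail;
      while (f > 0 && !nodes[f].next.has(ch)) f = nodes[f].fail;
      nodes[s].fail = nodes[f].next.has(ch) ? nodes[f].next.get(ch) : 0;
      // Inherit keywords that end at the failure target (suffix matches).
      nodes[s].out = nodes[s].out.concat(nodes[nodes[s].fail].out);
    }
  }
  return nodes;
}

function searchAll(nodes, text) {
  const results = [];
  let state = 0;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    // Follow failure links until some state accepts ch, or we reach the root.
    while (state > 0 && !nodes[state].next.has(ch)) state = nodes[state].fail;
    state = nodes[state].next.has(ch) ? nodes[state].next.get(ch) : 0;
    if (nodes[state].out.length > 0) results.push([i, nodes[state].out]);
  }
  return results;
}

const automaton = buildAutomaton(['he', 'she', 'his', 'hers']);
console.log(JSON.stringify(searchAll(automaton, 'ushers')));
// [[3,["she","he"]],[5,["hers"]]]
```

The text is read exactly once; the failure links let the scan fall back to the longest keyword prefix that is still a suffix of what has been read, instead of restarting the comparison for each keyword.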
What are ahocorasick's main functionalities?
Keyword Initialization
This feature allows you to initialize the Aho-Corasick automaton with a list of keywords. The automaton will then be able to search for these keywords in any given text.
const AhoCorasick = require('ahocorasick');
const ac = new AhoCorasick(['he', 'she', 'his', 'hers']);
Search Text
This feature allows you to search a given text for the initialized keywords. The search method returns an array of matches; each match is a pair consisting of the index of the last character of the match in the text and an array of the keywords that end at that index.
const AhoCorasick = require('ahocorasick');
const ac = new AhoCorasick(['he', 'she', 'his', 'hers']);
const results = ac.search('ushers');
console.log(results);
// [ [ 3, [ 'she', 'he' ] ], [ 5, [ 'hers' ] ] ]
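Because a result is an array of [endIndex, keywords] pairs (the shape shown in the README example on this page), a small amount of post-processing is often useful. The following is a minimal sketch of one such step, counting how often each keyword was reported. The helper name countByKeyword is hypothetical and not part of the package, and the sample array is illustrative data in the documented shape.

```javascript
// Hypothetical helper (not part of the ahocorasick package):
// tallies how many times each keyword appears in a search result.
function countByKeyword(results) {
  const counts = new Map();
  for (const [, keywords] of results) {
    for (const keyword of keywords) {
      counts.set(keyword, (counts.get(keyword) || 0) + 1);
    }
  }
  return counts;
}

// Illustrative data in the [endIndex, keywords] shape.
const sample = [[3, ['she', 'he']], [5, ['hers']], [9, ['he']]];
const counts = countByKeyword(sample);
console.log(counts.get('he')); // 2
```

A single pair can carry several keywords, so the inner loop is needed; iterating only over the pairs would undercount overlapping matches such as 'she' and 'he'.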
Add Keywords
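The README documents only the constructor and the search method; it does not describe a method for adding keywords after construction. To update the keyword list dynamically, keep the list yourself and build a new automaton from the extended list, as in the sketch below.
const AhoCorasick = require('ahocorasick');
const keywords = ['he', 'she'];
keywords.push('his', 'hers');
// Rebuild the automaton so that it includes the new keywords.
const ac = new AhoCorasick(keywords);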
Other packages similar to ahocorasick
string-search
The string-search package provides various string search algorithms, including the Aho-Corasick algorithm. It is more versatile as it offers multiple search algorithms, but it may be less optimized for the Aho-Corasick algorithm specifically compared to the ahocorasick package.
ahocorasick
Implementation of the Aho-Corasick string searching algorithm, as described in the paper
"Efficient string matching: an aid to bibliographic search".
Installing / Running
npm install ahocorasick
For Node.js:
var AhoCorasick = require('ahocorasick');
var ac = new AhoCorasick(['keyword1', 'keyword2', 'etc']);
var results = ac.search('should find keyword1 at position 19 and keyword2 at position 47.');
// [ [ 19, [ 'keyword1' ] ], [ 47, [ 'keyword2' ] ] ]
Or for browsers:
<script src="/src/main.js"></script>
<script>
var ac = new AhoCorasick(['keyword1', 'keyword2', 'etc']);
...
</script>
Check test/basic.js for more examples.
Note that each returned index is the position of the last character of the found keyword, not the first.
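Since the returned index points at the last character of a keyword, the start index has to be derived from the keyword's length. The following is a minimal sketch of that conversion; the helper name toSpans is hypothetical and not part of the package, and the result array is copied from the README example above.

```javascript
// Hypothetical helper (not part of the ahocorasick package): turns
// [endIndex, keywords] pairs into one record per keyword with a start index.
function toSpans(results) {
  const spans = [];
  for (const [end, keywords] of results) {
    for (const keyword of keywords) {
      // end is inclusive, so the keyword starts length - 1 characters earlier.
      spans.push({ keyword, start: end - keyword.length + 1, end });
    }
  }
  return spans;
}

const text = 'should find keyword1 at position 19 and keyword2 at position 47.';
const readmeResults = [[19, ['keyword1']], [47, ['keyword2']]];

for (const { keyword, start, end } of toSpans(readmeResults)) {
  // slice takes an exclusive upper bound, hence end + 1.
  console.log(keyword, start, text.slice(start, end + 1));
}
// keyword1 12 keyword1
// keyword2 40 keyword2
```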
Visualization
See https://brunorb.github.io/ahocorasick/visualization.html for an interactive visualization of the algorithm.
License
The MIT License